Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ClientBuilder auth info drop before connect #122

Merged
merged 2 commits into from
Sep 21, 2023
Merged

Conversation

fMeow
Copy link
Contributor

@fMeow fMeow commented Sep 20, 2023

Issue description

The ClientBuilder is not properly tested and contains a severe memory leak issue.

The bugyy code is show below:

let fs = unsafe {
    let builder = hdfsNewBuilder();

    let name_node = CString::new(self.name_node.as_bytes())?;
    hdfsBuilderSetNameNode(builder, name_node.as_ptr());

    if let Some(v) = self.user {
        let user = CString::new(v.as_bytes())?;
        hdfsBuilderSetUserName(builder, user.as_ptr());  // DANGLING POINTER
        // user is dropped here, but hdfsBuilder still holds a pointer to it
    }

    if let Some(v) = self.kerberos_ticket_cache_path {
        let ticket_cache_path = CString::new(v.as_bytes())?;
        hdfsBuilderSetKerbTicketCachePath(builder, ticket_cache_path.as_ptr()); // DANGLING POINTER
        // ticket_cache_path is dropped here, but hdfsBuilder still holds a pointer to it
    }
   
    // now the user and ticket_cache_path are used to setup auth, but the underlying CString is already freed
    hdfsBuilderConnect(builder)
};

Solution

The solution is to declare a variable that lives until where hdfsBuilderConnect is called. Here I use MaybeUninit to hold the user and ticket_cache_path string when necessary. This raises (maybe or not) MSRV to 1.36.0, which is released in fall 2019.

What this PR does

  • fix ClientBuilder::connect memory leak issue, properly set user and ticket_cache_path
  • add a unit test to validate user declaration

@fMeow
Copy link
Contributor Author

fMeow commented Sep 20, 2023

Please publish a patch release to crates.io under v0.3.1, and even better yank the previous v0.3.0.

The v0.3.0 provides no mean to properly declare user while the ClientBuilder::with_user() actually set a random non sense.

@fMeow fMeow force-pushed the main branch 3 times, most recently from e0ba504 to 762f85b Compare September 20, 2023 08:13
@fMeow
Copy link
Contributor Author

fMeow commented Sep 20, 2023

The github action always use user runner as hdfs user. This is wierd. I test the same code on my local enveironment and it surely passed.

@Xuanwo
Copy link
Owner

Xuanwo commented Sep 20, 2023

The github action always use user runner as hdfs user. This is wierd. I test the same code on my local enveironment and it surely passed.

hdfs default namenode is backed by local fs, I'm guessing the returning user is also based on local fs's user too.

@fMeow
Copy link
Contributor Author

fMeow commented Sep 21, 2023

I guess the username test should only run on cluster env.

@Xuanwo
Copy link
Owner

Xuanwo commented Sep 21, 2023

Thanks!

@Xuanwo Xuanwo merged commit 64ecf89 into Xuanwo:main Sep 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants